AITopics | information-theoretic generalization bound

Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates

Neural Information Processing Systems, Dec-24-2025, 23:23:15 GMT

In this work, we improve upon the stepwise analysis of noisy iterative learning algorithms initiated by Pensia, Jog, and Loh (2018) and recently extended by Bu, Zou, and Veeravalli (2019). Our main contributions are significantly improved mutual information bounds for Stochastic Gradient Langevin Dynamics via data-dependent estimates. Our approach is based on the variational characterization of mutual information and the use of data-dependent priors that forecast the mini-batch gradient based on a subset of the training samples. Our approach is broadly applicable within the information-theoretic framework of Russo and Zou (2015) and Xu and Raginsky (2017). Our bound can be tied to a measure of flatness of the empirical risk surface. As compared with other bounds that depend on the squared norms of gradients, empirical investigations show that the terms in our bounds are orders of magnitude smaller.

data-dependent estimate, information-theoretic generalization bound, name change, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.79)
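For readers unfamiliar with the algorithm named in the entry above, the following is a minimal sketch of Stochastic Gradient Langevin Dynamics: a mini-batch gradient step followed by injected Gaussian noise. It is not the authors' code and does not compute their bound. The toy one-dimensional regression problem, the function names `sgld` and `make_data`, and all hyperparameter values are assumptions chosen only for illustration.

```python
import math
import random


def make_data(n=64, true_w=2.0, seed=1):
    """Hypothetical toy dataset: y = true_w * x + small Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, true_w * x + 0.1 * rng.gauss(0.0, 1.0)) for x in xs]


def sgld(data, steps=2000, batch_size=8, step_size=0.01,
         inverse_temp=1e4, seed=0):
    """Run SGLD on the squared loss 0.5 * (w * x - y) ** 2 for a scalar w.

    Update rule (one common form of SGLD):
        w <- w - step_size * minibatch_gradient + N(0, 2 * step_size / inverse_temp)
    """
    rng = random.Random(seed)
    w = 0.0
    noise_std = math.sqrt(2.0 * step_size / inverse_temp)
    for _ in range(steps):
        # Mini-batch drawn without replacement from the training set.
        batch = rng.sample(data, batch_size)
        # Average gradient of the squared loss over the mini-batch.
        grad = sum((w * x - y) * x for x, y in batch) / batch_size
        # Gradient step plus isotropic Gaussian noise.
        w = w - step_size * grad + noise_std * rng.gauss(0.0, 1.0)
    return w


if __name__ == "__main__":
    data = make_data()
    print("SGLD estimate of w:", sgld(data))
```

The injected noise is what makes stepwise mutual information analyses of this kind possible: each iterate is a noisy function of the previous iterate and the mini-batch.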

Conditioning and Processing: Techniques to Improve Information-Theoretic Generalization Bounds

Neural Information Processing Systems, Dec-24-2025, 13:07:11 GMT

Obtaining generalization bounds for learning algorithms is one of the main subjects studied in theoretical machine learning. In recent years, information-theoretic bounds on generalization have gained the attention of researchers. This approach provides an insight into learning algorithms by considering the mutual information between the model and the training set. In this paper, a probabilistic graphical representation of this approach is adopted and two general techniques to improve the bounds are introduced, namely conditioning and processing. In conditioning, a random variable in the graph is considered as given, while in processing a random variable is substituted with one of its children. These techniques can be used to improve the bounds by either sharpening them or increasing their applicability. It is demonstrated that the proposed framework provides a simple and unified way to explain a variety of recent tightening results. New improved bounds derived by utilizing these techniques are also proposed.

conditioning and processing, information-theoretic generalization bound, name change, (5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
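The entry above describes "processing" as substituting a random variable with one of its children. One standard fact that makes such substitutions useful is the data-processing inequality: if W' is computed from W alone, then I(S; W') is at most I(S; W). The sketch below checks this numerically on a small discrete example. It illustrates the inequality only and is not the paper's construction; the joint distribution, the coarsening map, and the helper names `mutual_information` and `process` are hypothetical.

```python
import math


def mutual_information(joint):
    """Mutual information in nats of a joint pmf given as {(s, w): prob}."""
    p_s, p_w = {}, {}
    for (s, w), p in joint.items():
        p_s[s] = p_s.get(s, 0.0) + p
        p_w[w] = p_w.get(w, 0.0) + p
    return sum(
        p * math.log(p / (p_s[s] * p_w[w]))
        for (s, w), p in joint.items()
        if p > 0.0
    )


def process(joint, f):
    """Joint pmf of (S, f(W)): replace W by a deterministic function of W."""
    out = {}
    for (s, w), p in joint.items():
        key = (s, f(w))
        out[key] = out.get(key, 0.0) + p
    return out


# Hypothetical toy example: S is a binary "dataset", W a 4-valued "model".
joint_sw = {
    (0, 0): 0.30, (0, 1): 0.10, (0, 2): 0.05, (0, 3): 0.05,
    (1, 0): 0.05, (1, 1): 0.05, (1, 2): 0.10, (1, 3): 0.30,
}

# Coarsen W from 4 values to 2 values (a child of W in the graph S -> W -> W').
joint_processed = process(joint_sw, lambda w: w // 2)

i_full = mutual_information(joint_sw)
i_processed = mutual_information(joint_processed)

if __name__ == "__main__":
    print("I(S; W)  =", i_full)
    print("I(S; W') =", i_processed)
```

Running it shows that the coarsened variable carries no more information about S than W does.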

Conditioning and Processing: Techniques to Improve Information-Theoretic Generalization Bounds

Neural Information Processing Systems, Nov-15-2025, 05:51:02 GMT

Bounding the generalization gap is one of the most studied problems in theoretical machine learning.

generalization, information, mutual information, (12 more...)

Country: North America > Canada (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Conditioning and Processing: Techniques to Improve Information-Theoretic Generalization Bounds

Neural Information Processing Systems, Oct-11-2024, 06:20:08 GMT

Obtaining generalization bounds for learning algorithms is one of the main subjects studied in theoretical machine learning. In recent years, information-theoretic bounds on generalization have gained the attention of researchers. This approach provides an insight into learning algorithms by considering the mutual information between the model and the training set. In this paper, a probabilistic graphical representation of this approach is adopted and two general techniques to improve the bounds are introduced, namely conditioning and processing. In conditioning, a random variable in the graph is considered as given, while in processing a random variable is substituted with one of its children.

conditioning and processing, information-theoretic generalization bound, random variable, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates

Neural Information Processing Systems, Oct-9-2024, 11:06:04 GMT

In this work, we improve upon the stepwise analysis of noisy iterative learning algorithms initiated by Pensia, Jog, and Loh (2018) and recently extended by Bu, Zou, and Veeravalli (2019). Our main contributions are significantly improved mutual information bounds for Stochastic Gradient Langevin Dynamics via data-dependent estimates. Our approach is based on the variational characterization of mutual information and the use of data-dependent priors that forecast the mini-batch gradient based on a subset of the training samples. Our approach is broadly applicable within the information-theoretic framework of Russo and Zou (2015) and Xu and Raginsky (2017). Our bound can be tied to a measure of flatness of the empirical risk surface.

data-dependent estimate, gradient, information-theoretic generalization bound, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.85)

Information-Theoretic Generalization Bounds for Transductive Learning and its Applications

Tang, Huayi; Liu, Yong

arXiv.org Machine Learning, Nov-8-2023

In this paper, we develop data-dependent and algorithm-dependent generalization bounds for transductive learning algorithms in the context of information theory for the first time. We show that the generalization gap of transductive learning algorithms can be bounded by the mutual information between training labels and hypothesis. By innovatively proposing the concept of transductive supersamples, we go beyond the inductive learning setting and establish upper bounds in terms of various information measures. Furthermore, we derive novel PAC-Bayesian bounds and build the connection between generalization and loss landscape flatness under the transductive learning setting. Finally, we present the upper bounds for adaptive optimization algorithms and demonstrate the applications of results on semi-supervised learning and graph learning scenarios. Our theoretic results are validated on both synthetic and real-world datasets.

artificial intelligence, generalization, machine learning, (12 more...)

arXiv:2311.04561

Country:

Asia > China > Beijing > Beijing (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Generalization Bounds: Perspectives from Information Theory and PAC-Bayes

Hellström, Fredrik; Durisi, Giuseppe; Guedj, Benjamin; Raginsky, Maxim

arXiv.org Machine Learning, Sep-8-2023

A fundamental question in theoretical machine learning is generalization. Over the past decades, the PAC-Bayesian approach has been established as a flexible framework to address the generalization capabilities of machine learning algorithms, and design new ones. Recently, it has garnered increased interest due to its potential applicability for a variety of learning algorithms, including deep neural networks. In parallel, an information-theoretic view of generalization has developed, wherein the relation between generalization and various information measures has been established. This framework is intimately connected to the PAC-Bayesian approach, and a number of results have been independently discovered in both strands. In this monograph, we highlight this strong connection and present a unified treatment of generalization. We present techniques and results that the two perspectives have in common, and discuss the approaches and interpretations that differ. In particular, we demonstrate how many proofs in the area share a modular structure, through which the underlying ideas can be intuited. We pay special attention to the conditional mutual information (CMI) framework; analytical studies of the information complexity of learning algorithms; and the application of the proposed methods to deep learning. This monograph is intended to provide a comprehensive introduction to information-theoretic generalization bounds and their connection to PAC-Bayes, serving as a foundation from which the most recent developments are accessible. It is aimed broadly towards researchers with an interest in generalization and theoretical machine learning.

artificial intelligence, information-theoretic generalization bound, machine learning, (18 more...)

arXiv:2309.04381

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > France > Île-de-France > Paris > Paris (0.14)
(51 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Government (0.67)
Education > Educational Setting (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
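As a pointer to what the bounds surveyed in the entry above look like, here are two well-known results from this literature in their usual textbook form. The notation is an assumption introduced here, not taken from the listing: S = (Z_1, ..., Z_n) is an i.i.d. training sample from a distribution \mu, W is the output of the learning algorithm, L_\mu(W) is the population risk, and L_S(W) is the empirical risk. In the second bound, \tilde{Z} is a supersample of 2n points arranged in n pairs and U is uniform on {0,1}^n, selecting one point of each pair for training.

```latex
% Mutual information bound (Xu and Raginsky, 2017), assuming the loss
% \ell(w, Z) is \sigma-sub-Gaussian under Z \sim \mu for every w:
\[
  \left| \mathbb{E}\big[ L_\mu(W) - L_S(W) \big] \right|
  \;\le\; \sqrt{\frac{2\sigma^2}{n}\, I(S; W)} .
\]

% Conditional mutual information (CMI) bound (Steinke and Zakynthinou, 2020),
% assuming the loss takes values in [0, 1]:
\[
  \left| \mathbb{E}\big[ L_\mu(W) - L_S(W) \big] \right|
  \;\le\; \sqrt{\frac{2}{n}\, I(W; U \mid \tilde{Z})} .
\]
```

The second quantity is always finite, at most n log 2 nats, whereas I(S; W) can be infinite, which is one reason the CMI framework receives special attention.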

Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates

Negrea, Jeffrey; Haghifam, Mahdi; Dziugaite, Gintare Karolina; Khisti, Ashish; Roy, Daniel M.

Neural Information Processing Systems, Mar-19-2020, 01:03:20 GMT

In this work, we improve upon the stepwise analysis of noisy iterative learning algorithms initiated by Pensia, Jog, and Loh (2018) and recently extended by Bu, Zou, and Veeravalli (2019). Our main contributions are significantly improved mutual information bounds for Stochastic Gradient Langevin Dynamics via data-dependent estimates. Our approach is based on the variational characterization of mutual information and the use of data-dependent priors that forecast the mini-batch gradient based on a subset of the training samples. Our approach is broadly applicable within the information-theoretic framework of Russo and Zou (2015) and Xu and Raginsky (2017). Our bound can be tied to a measure of flatness of the empirical risk surface.

data-dependent estimate, gradient, information-theoretic generalization bound, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.92)
